Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 cutover #36

Merged
merged 4 commits into from
Apr 4, 2024
Merged

v2 cutover #36

merged 4 commits into from
Apr 4, 2024

Conversation

dogversioning
Copy link
Contributor

Description

This PR cuts over fields to the v2 library field names

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added
  • Run pylint if you're making changes beyond adding studies

@dogversioning dogversioning marked this pull request as ready for review April 4, 2024 12:36
@dogversioning
Copy link
Contributor Author

in the absence of unit tests, here are some counts from the previous run (before the additional data we added) and the new version:


#   table_name  con
1   covid_symptom__count_dx_month   458
2   covid_symptom__count_dx_week    452
3   covid_symptom__count_pcr_month  0
4   covid_symptom__count_pcr_week   0
5   covid_symptom__count_prevalence_ed_month    0
6   covid_symptom__count_prevalence_ed_week 0
7   covid_symptom__count_study_period_month 11952
8   covid_symptom__count_study_period_week  39720
9   covid_symptom__count_symptom_month  164212
10  covid_symptom__count_symptom_week   271594
11  covid_symptom__define_age   111
12  covid_symptom__define_dx_icd10  1
13  covid_symptom__define_dx_snomed 23
14  covid_symptom__define_pcr   52
15  covid_symptom__define_pcr_custom    4
16  covid_symptom__define_pcr_negative  22
17  covid_symptom__define_pcr_positive  30
18  covid_symptom__define_period    3
19  covid_symptom__define_symptom   155
20  covid_symptom__dx   1734
21  covid_symptom__meta_date    1
22  covid_symptom__meta_version 1
23  covid_symptom__nlp_results  4490835
24  covid_symptom__pcr  0
25  covid_symptom__prevalence_ed    0
26  covid_symptom__site_ed_note 6
27  covid_symptom__study_period 845146
28  covid_symptom__symptom_ctakes_negation  150756

new

#   table_name  con
1   covid_symptom__count_dx_month   584
2   covid_symptom__count_dx_week    526
3   covid_symptom__count_prevalence_ed_month    0
4   covid_symptom__count_prevalence_ed_week 0
5   covid_symptom__count_study_period_month 17050
6   covid_symptom__count_study_period_week  52250
7   covid_symptom__count_symptom_month  164294
8   covid_symptom__count_symptom_week   271716
9   covid_symptom__define_age   111
10  covid_symptom__define_dx_icd10  1
11  covid_symptom__define_dx_snomed 23
12  covid_symptom__define_pcr   52
13  covid_symptom__define_pcr_custom    4
14  covid_symptom__define_pcr_negative  22
15  covid_symptom__define_pcr_positive  30
16  covid_symptom__define_period    3
17  covid_symptom__define_symptom   155
18  covid_symptom__dx   2226
19  covid_symptom__lib_transactions 39
20  covid_symptom__meta_date    1
21  covid_symptom__meta_version 1
22  covid_symptom__nlp_results  4490835
23  covid_symptom__nlp_results_term_exists  3452580
24  covid_symptom__pcr  0
25  covid_symptom__prevalence_ed    0
26  covid_symptom__site_ed_note 6
27  covid_symptom__study_period 2215282
28  covid_symptom__symptom_ctakes_negation  150861
29  covid_symptom__symptom_icd10    20914

We're accepting the no PCR count for now as a defect of our internal implementation and will follow on with a dot release here/in the library to try to address.

version = "0.2.2"
version = "1.0.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎊

Comment on lines -27 to +30
o.lab_code AS covid_pcr_code,
o.lab_date AS covid_pcr_date,
o.lab_week AS covid_pcr_week,
o.lab_month AS covid_pcr_month,
o.observation_code AS covid_pcr_code,
o.effectivedatetime_day AS covid_pcr_date,
o.effectivedatetime_week AS covid_pcr_week,
o.effectivedatetime_month AS covid_pcr_month,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new field names are much clearer (if you are familiar with the FHIR spec anyway, and that is probably a given for the audience).

Though... I'm already wondering what you'll do once you decide to abstract away the effectiveX fields - like if core decides to also examine effectiveInstant or effectivePeriod if they are present instead of datetime - do we have a naming convention for that kind of abstraction of multi-format fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, no - we're are lightly overloading fields in some places (like period.end, which is often null and we coalesce with period.start).

We can talk about this a bit maybe outside the context of this PR, but I sort of think that's ok as part of the core study's simplification promise - if for some reason it's not, you can roll your own.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - but I want the simplification promise to go even further 😄 - give me just effective_day and that will be the day for whichever of the effectiveX fields happened to be filled out. That's right in core's wheelhouse

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Observation.effective especially triggers me because it offers two fields that are so very similar: effectiveDateTime and effectiveInstant - the only difference being whether the field can hold sub-seconds. But now all consumers have to check both fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the effective day promise - like, i'm fine with saying that should always resolve to a reasonable thing.

week and month are trickier, if only because they are very common binning units for studies - we're basically obviating away date casting downstream at the cost of space on disk. I'm sort of ok with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for posterity - after clarifying some stuff off thread, we created an issue in core to simplify this kind of rat king type handling smart-on-fhir/cumulus-library#205

SELECT 2 AS data_package_version;
SELECT 3 AS data_package_version;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because of the enc_class_code -> enc_class_display field change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and also that, with a breaking changes in the upstream tables, it felt like a good idea to provide a hard deliniation here.

@dogversioning dogversioning merged commit bdee70f into main Apr 4, 2024
2 checks passed
@dogversioning dogversioning deleted the mg/v2-cutover branch April 4, 2024 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants